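Reliable outlier detection is critical for real-world applications of deep learning models. Although extensively studied, likelihoods produced by deep generative models are still considered impractical for outlier detection. First, deep generative model likelihoods are easily biased by low-level input statistics. Second, many solutions for correcting these biases are computationally expensive or generalize poorly to complex, natural datasets. Here, we explore outlier detection with a state-of-the-art deep autoregressive model: PixelCNN++. We show that biases in PixelCNN++ likelihoods arise primarily from predictions based on local dependencies. We propose two families of bijective transformations, which we call "shaking" and "stirring", that ameliorate low-level biases and isolate the contribution of long-range dependencies to PixelCNN++ likelihoods. These transformations are computationally inexpensive and readily applied at evaluation time. We evaluate our approaches extensively with five grayscale and six natural image datasets and show that they match or exceed state-of-the-art outlier detection performance. In sum, lightweight remedies suffice to achieve robust outlier detection on images with deep generative models.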
translated by 谷歌翻译
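Deep networks often make confident, yet incorrect, predictions when tested with outlier data that is far removed from their training distributions. Likelihoods computed by deep generative models (DGMs) are a candidate metric for outlier detection with unlabeled data. Yet, previous studies have shown that DGM likelihoods are unreliable and can be easily biased by simple transformations of the input data. Here, we examine outlier detection with variational autoencoders (VAEs), among the simplest of DGMs. We propose novel analytical and algorithmic approaches to ameliorate key biases in VAE likelihoods. Our bias corrections are sample-specific, computationally inexpensive, and readily computed for various decoder visible distributions. Next, we show that a well-known image preprocessing technique (contrast stretching) extends the effectiveness of bias correction to further improve outlier detection. Our approach achieves state-of-the-art accuracies with nine grayscale and natural image datasets and demonstrates significant advantages, in both speed and performance, over four recent competing approaches. In sum, lightweight remedies suffice to achieve robust outlier detection with VAEs.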
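Contrast stretching, the preprocessing step named in the abstract above, can be illustrated with a minimal sketch. The function name `contrast_stretch` and the 2nd/98th percentile cut-offs are assumptions chosen for illustration; the abstract does not state which variant or thresholds the authors use.

```python
import numpy as np

def contrast_stretch(image, low_pct=2.0, high_pct=98.0):
    """Linearly rescale intensities so the chosen percentiles map to 0 and 1.

    The percentile defaults are illustrative assumptions, not values taken
    from the paper.
    """
    image = np.asarray(image, dtype=np.float64)
    lo, hi = np.percentile(image, [low_pct, high_pct])
    if hi <= lo:
        # Constant (or near-constant) image: there is no range to stretch.
        return np.zeros_like(image)
    # Values outside [lo, hi] are clipped to the ends of the output range.
    return np.clip((image - lo) / (hi - lo), 0.0, 1.0)

# Usage: a low-contrast image confined to [0.4, 0.6) is spread over [0, 1].
rng = np.random.default_rng(0)
dull = 0.4 + 0.2 * rng.random((8, 8))
stretched = contrast_stretch(dull)
```

The rescaling is applied per image, so it removes differences in overall brightness range between samples before any likelihood is computed.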
Personality refers to the combination of a person's conduct, emotion, motivation, and thinking. To shortlist candidates more effectively, many organizations rely on personality predictions. By grouping applicants according to the personality preferences a role requires, a firm can select the best candidate for the desired job description. A model is created to identify applicants' personality types from their facial expressions, speech intonation, and resumes, so that employers can find qualified candidates. Additionally, the paper emphasises detecting changes in employee behaviour: employee attitudes and behaviour towards each set of questions are examined and analysed. Here, the K-Modes clustering method is used to predict employee well-being, including job pressure, the working environment, and relationships with peers, utilizing the OCEAN model and the CNN algorithm in the AVI-AI administrative system. The findings imply that AVIs can be used for efficient candidate screening with an AI decision agent. Study of this specific field goes beyond the current explorations and needs to be expanded with deeper models and new configurations that can handle extremely complex operations.
We propose the fully differentiable $\nabla$-RANSAC. It predicts the inlier probabilities of the input data points, exploits the predictions in a guided sampler, and estimates the model parameters (e.g., fundamental matrix) and their quality while propagating the gradients through the entire procedure. The random sampler in $\nabla$-RANSAC is based on a clever re-parametrization strategy, i.e.\ the Gumbel Softmax sampler, that allows propagating the gradients directly into the subsequent differentiable minimal solver. The model quality function marginalizes over the scores from all models estimated within $\nabla$-RANSAC to guide the network in learning accurate and useful probabilities. $\nabla$-RANSAC is the first to unlock the end-to-end training of geometric estimation pipelines containing feature detection, matching, and RANSAC-like randomized robust estimation. As a proof of its potential, we train $\nabla$-RANSAC together with LoFTR, i.e. a recent detector-free feature matcher, to find reliable correspondences in an end-to-end manner. We test $\nabla$-RANSAC on a number of real-world datasets on fundamental and essential matrix estimation. It is superior to the state-of-the-art in terms of accuracy while being among the fastest methods. The code and trained models will be made public.
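The Gumbel Softmax re-parametrization mentioned above can be sketched in a few lines. This is a generic illustration of the trick, not the sampler used in $\nabla$-RANSAC: the function name `gumbel_softmax_sample`, the example logits, and the temperature of 0.5 are assumptions, and the step that turns such weights into a minimal sample is not shown.

```python
import numpy as np

def gumbel_softmax_sample(logits, temperature, rng):
    """Draw one relaxed (soft) one-hot sample from a categorical distribution.

    Adding Gumbel(0, 1) noise to the logits and applying a temperature-scaled
    softmax gives a sample that is a smooth function of the logits, which is
    what lets gradients flow through the sampling step.
    """
    logits = np.asarray(logits, dtype=np.float64)
    # Uniform noise bounded away from 0 so that both logarithms are finite.
    u = rng.uniform(low=1e-12, high=1.0, size=logits.shape)
    gumbel_noise = -np.log(-np.log(u))
    scaled = (logits + gumbel_noise) / temperature
    scaled = scaled - scaled.max()  # subtract the max for numerical stability
    exp_scaled = np.exp(scaled)
    return exp_scaled / exp_scaled.sum()

# Usage: hypothetical inlier logits for six correspondences.
rng = np.random.default_rng(0)
inlier_logits = np.array([2.0, 0.5, -1.0, 1.5, 0.0, -2.0])
weights = gumbel_softmax_sample(inlier_logits, temperature=0.5, rng=rng)
```

Lower temperatures push the output toward a hard one-hot selection, while higher temperatures give smoother weights and lower-variance gradients.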
Model Predictive Controllers (MPCs) are widely used for controlling cyber-physical systems. Model predictive control is an iterative process of optimizing the prediction of the future states of a robot over a fixed time horizon. MPCs are effective in practice, but because they are computationally expensive and slow, they are not well suited for use in real-time applications. This flaw can be overcome by approximating an MPC's functionality. Neural networks are very good function approximators and are faster than an MPC. However, it can be challenging to apply neural networks to control-based applications, since the data does not satisfy the i.i.d. assumption. This study investigates various imitation learning methods for using a neural network in a control-based environment and evaluates their benefits and shortcomings.
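One way to cope with the non-i.i.d. data issue raised above is dataset aggregation (DAgger), a standard imitation-learning scheme. The abstract does not say which methods the study covers, so the choice of DAgger is an assumption. The sketch below is a toy: `expert_action` is a hypothetical stand-in for an MPC solve, the dynamics are a made-up scalar system, and the learner is a single gain rather than a neural network.

```python
import numpy as np

def expert_action(state):
    # Hypothetical stand-in for an expensive MPC solve: fixed linear feedback.
    return -0.8 * state

def rollout(policy_gain, steps, x0):
    """Run the learner's policy u = gain * x on toy dynamics x' = x + 0.5 * u."""
    states = []
    x = x0
    for _ in range(steps):
        states.append(x)
        x = x + 0.5 * (policy_gain * x)
    return np.array(states)

def fit_gain(states, actions):
    # Least-squares fit of a scalar gain so that actions ~ gain * states.
    return float(states @ actions / (states @ states))

def dagger(iterations=5, steps=20, x0=1.0):
    gain = 0.0  # the initial learner does nothing
    all_states = np.empty(0)
    all_actions = np.empty(0)
    for _ in range(iterations):
        visited = rollout(gain, steps, x0)   # states the learner itself reaches
        labels = expert_action(visited)      # the expert relabels those states
        all_states = np.concatenate([all_states, visited])
        all_actions = np.concatenate([all_actions, labels])
        gain = fit_gain(all_states, all_actions)  # retrain on the aggregate
    return gain

learned_gain = dagger()
```

The training states come from the learner's own rollouts rather than only from expert trajectories, which is how the scheme addresses the mismatch between training and deployment distributions.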
Abstractive dialogue summarization has received increasing attention recently. Although most current dialogue summarization systems are trained to maximize the likelihood of human-written summaries and have achieved significant results, there is still a huge gap in generating summaries that humans judge to be of high quality, for example in coherence and faithfulness, partly due to the misalignment in maximizing the likelihood of a single human-written summary. To this end, we propose to incorporate different levels of human feedback into the training process. This will enable us to guide the models to capture the behaviors humans care about for summaries. Specifically, we ask humans to highlight the salient information to be included in summaries to provide the local feedback, and to make overall comparisons among summaries in terms of coherence, accuracy, coverage, conciseness, and overall quality, as the global feedback. We then combine both local and global feedback to fine-tune the dialogue summarization policy with Reinforcement Learning. Experiments conducted on multiple datasets demonstrate the effectiveness and generalization of our methods over the state-of-the-art supervised baselines, especially in terms of human judgments.
Line segments are ubiquitous in our human-made world and are increasingly used in vision tasks. They are complementary to feature points thanks to their spatial extent and the structural information they provide. Traditional line detectors based on the image gradient are extremely fast and accurate, but lack robustness in noisy images and challenging conditions. Their learned counterparts are more repeatable and can handle challenging images, but at the cost of a lower accuracy and a bias towards wireframe lines. We propose to combine traditional and learned approaches to get the best of both worlds: an accurate and robust line detector that can be trained in the wild without ground truth lines. Our new line segment detector, DeepLSD, processes images with a deep network to generate a line attraction field, before converting it to a surrogate image gradient magnitude and angle, which is then fed to any existing handcrafted line detector. Additionally, we propose a new optimization tool to refine line segments based on the attraction field and vanishing points. This refinement improves the accuracy of current deep detectors by a large margin. We demonstrate the performance of our method on low-level line detection metrics, as well as on several downstream tasks using multiple challenging datasets. The source code and models are available at https://github.com/cvg/DeepLSD.
Human activity recognition (HAR) using drone-mounted cameras has attracted considerable interest from the computer vision research community in recent years. A robust and efficient HAR system has a pivotal role in fields like video surveillance, crowd behavior analysis, sports analysis, and human-computer interaction. The task is challenging because of complex poses, varying viewpoints, and the environmental scenarios in which the action takes place. To address such complexities, in this paper, we propose a novel Sparse Weighted Temporal Attention (SWTA) module to utilize sparsely sampled video frames for obtaining global weighted temporal attention. The proposed SWTA comprises two parts: first, a temporal segment network that sparsely samples a given set of frames; second, weighted temporal attention, which fuses attention maps derived from optical flow with raw RGB images. This is followed by a basenet network, which comprises a convolutional neural network (CNN) module along with fully connected layers that perform the activity recognition. The SWTA network can be used as a plug-in module for existing deep CNN architectures, optimizing them to learn temporal information and eliminating the need for a separate temporal stream. It has been evaluated on three publicly available benchmark datasets, namely Okutama, MOD20, and Drone-Action. The proposed model achieves accuracies of 72.76%, 92.56%, and 78.86% on the respective datasets, thereby surpassing the previous state-of-the-art performances by margins of 25.26%, 18.56%, and 2.94%, respectively.
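The sparse sampling performed by the temporal segment network can be sketched as follows, assuming the common segment-based scheme: split the video into equal segments and take one frame from each. The function name `sparse_sample_indices`, the segment count, and the choice between centre and random frames are illustrative assumptions, not details stated in the abstract.

```python
import numpy as np

def sparse_sample_indices(num_frames, num_segments, rng=None):
    """Pick one frame index from each of `num_segments` equal-length segments.

    With `rng=None` the centre frame of each segment is returned, giving a
    deterministic result; otherwise one frame is drawn uniformly at random
    from each segment.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    # Integer segment boundaries: edges[k] = floor(k * num_frames / num_segments).
    edges = (np.arange(num_segments + 1) * num_frames) // num_segments
    starts, stops = edges[:-1], edges[1:]
    if rng is None:
        return (starts + stops - 1) // 2
    return np.array([rng.integers(lo, hi) for lo, hi in zip(starts, stops)])

# Usage: 8 frames that together span a 100-frame clip.
center_idx = sparse_sample_indices(100, 8)
random_idx = sparse_sample_indices(100, 8, rng=np.random.default_rng(0))
```

Each sampled frame stands in for its whole segment, so a handful of frames covers the full clip at a small fraction of the cost of dense sampling.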
Experimental sciences have come to depend heavily on our ability to organize, interpret and analyze high-dimensional datasets produced from observations of a large number of variables governed by natural processes. Natural laws, conservation principles, and dynamical structure introduce intricate inter-dependencies among these observed variables, which in turn yield geometric structure, with fewer degrees of freedom, on the dataset. We show how fine-scale features of this structure in data can be extracted from \emph{discrete} approximations to quantum mechanical processes given by data-driven graph Laplacians and localized wavepackets. This data-driven quantization procedure leads to a novel, yet natural uncertainty principle for data analysis induced by limited data. We illustrate the new approach with algorithms and several applications to real-world data, including the learning of patterns and anomalies in social distancing and mobility behavior during the COVID-19 pandemic.
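A data-driven graph Laplacian of the general kind referred to above can be sketched as follows. The Gaussian kernel, the bandwidth `epsilon`, and the unnormalized form L = D - W are assumptions made for illustration; the abstract does not specify the authors' construction, and the wavepacket machinery is not reproduced here.

```python
import numpy as np

def graph_laplacian(points, epsilon):
    """Unnormalized graph Laplacian L = D - W built with a Gaussian kernel.

    `points` has shape (n_samples, n_features). `epsilon` sets the kernel
    bandwidth, i.e. the scale at which two samples count as neighbours.
    """
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    weights = np.exp(-sq_dists / epsilon)
    np.fill_diagonal(weights, 0.0)      # no self-loops
    degrees = np.diag(weights.sum(axis=1))
    return degrees - weights

# Usage: samples on a circle, a one-dimensional structure embedded in 2D.
angles = np.linspace(0.0, 2.0 * np.pi, 40, endpoint=False)
circle = np.stack([np.cos(angles), np.sin(angles)], axis=1)
laplacian = graph_laplacian(circle, epsilon=0.1)
eigenvalues = np.linalg.eigvalsh(laplacian)
```

The low end of the spectrum reflects the geometry of the sampled set: here the smallest eigenvalue is zero, with the constant vector as its eigenvector, because the graph is connected.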
Generative Adversarial Networks (GANs) have received wide acclaim among the machine learning (ML) community for their ability to generate realistic 2D images. ML is being applied more often to complex problems beyond those of computer vision. However, current frameworks often serve as black boxes and lack physics embeddings, leading to poor ability in enforcing constraints and unreliable models. In this work, we develop physics embeddings that can be stringently imposed, referred to as hard constraints, in the neural network architecture. We demonstrate their capability for 3D turbulence by embedding them in GANs, particularly to enforce the mass conservation constraint in incompressible fluid turbulence. In doing so, we also explore and contrast the effects of other methods of imposing physics constraints within the GANs framework, especially penalty-based physics constraints popular in literature. By using physics-informed diagnostics and statistics, we evaluate the strengths and weaknesses of our approach and demonstrate its feasibility.
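To make the idea of a hard mass-conservation constraint concrete, the sketch below uses one well-known construction: writing the velocity as the curl of a vector potential, so that its discrete divergence vanishes by construction. Whether the paper uses exactly this construction is an assumption; the periodic grid, the central-difference stencil, and the function names are also illustrative.

```python
import numpy as np

def ddx(field, axis, h=1.0):
    """Central difference along `axis` on a periodic grid with spacing `h`."""
    return (np.roll(field, -1, axis=axis) - np.roll(field, 1, axis=axis)) / (2.0 * h)

def curl(potential):
    """Discrete curl of a vector potential with shape (3, nx, ny, nz)."""
    ax, ay, az = potential
    return np.stack([
        ddx(az, 1) - ddx(ay, 2),   # u_x = dAz/dy - dAy/dz
        ddx(ax, 2) - ddx(az, 0),   # u_y = dAx/dz - dAz/dx
        ddx(ay, 0) - ddx(ax, 1),   # u_z = dAy/dx - dAx/dy
    ])

def divergence(velocity):
    """Discrete divergence of a velocity field with shape (3, nx, ny, nz)."""
    return ddx(velocity[0], 0) + ddx(velocity[1], 1) + ddx(velocity[2], 2)

# Usage: any potential, here random noise standing in for a generator output,
# yields a velocity field whose discrete divergence is zero up to round-off.
rng = np.random.default_rng(0)
potential = rng.standard_normal((3, 16, 16, 16))
velocity = curl(potential)
max_divergence = np.abs(divergence(velocity)).max()
```

Because the same difference operators appear in both the curl and the divergence, the mixed derivatives cancel for every input. A penalty term in the loss, by contrast, only encourages small divergence.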